Multilevel Linear Dimensionality Reduction for Data Analysis using Nearest-Neighbor Graphs
Authors
Abstract
Dimension reduction techniques can be time-consuming when the data set is large. This paper presents a multilevel framework that reduces the size of the data set prior to performing dimension reduction. The algorithm exploits nearest-neighbor graphs: it recursively coarsens the data by finding a maximal matching level by level. Once the coarsest graph is available, the coarsened data is projected at the lowest level using a known linear dimensionality reduction method. To obtain the projected data at the highest level, the same linear mapping computed at the lowest level is applied to the original data set, and to any new test data. The methods are illustrated on three applications: manifold mapping, face recognition, and text mining. Experimental results indicate that the multilevel techniques presented in this paper offer a very appealing cost-to-quality ratio.
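The pipeline in the abstract can be sketched in a few lines, assuming scikit-learn for the nearest-neighbor graph and for the linear projection (PCA stands in for whichever linear method is used; the greedy pairing below is one simple way to realize a maximal matching, and the function name `coarsen_by_matching` is illustrative, not from the paper):

```python
import numpy as np
from sklearn.neighbors import kneighbors_graph
from sklearn.decomposition import PCA

def coarsen_by_matching(X, n_neighbors=5):
    """One coarsening level: greedily match each point with an unmatched
    nearest neighbor and replace the pair by its centroid."""
    G = kneighbors_graph(X, n_neighbors=n_neighbors, mode="connectivity")
    matched = np.zeros(len(X), dtype=bool)
    coarse = []
    for i in range(len(X)):
        if matched[i]:
            continue
        matched[i] = True
        # pick the first still-unmatched neighbor of i, if any
        partner = next((j for j in G[i].indices if not matched[j]), None)
        if partner is None:
            coarse.append(X[i])            # unmatched points survive as-is
        else:
            matched[partner] = True
            coarse.append((X[i] + X[partner]) / 2)
    return np.asarray(coarse)

# Recursively coarsen, learn the linear map on the coarsest level,
# then apply that same map to the original (full) data set.
rng = np.random.default_rng(0)
X = rng.standard_normal((400, 10))
Xc = X
for _ in range(3):                    # three coarsening levels
    Xc = coarsen_by_matching(Xc)
pca = PCA(n_components=2).fit(Xc)     # fit on the small coarse set only
Y = pca.transform(X)                  # project the full data set
print(Y.shape)                        # (400, 2)
```

The key saving is that the expensive step (fitting the projection) sees only the coarse set, which shrinks roughly by half per level, while the final linear map costs just one matrix multiply on the full data.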
Similar resources
Feature extraction for nearest neighbor classification: Application to gender recognition
In this article, we perform an extended analysis of different face-processing techniques for gender recognition problems. Prior work shows that support vector machines (SVM) achieve the best classification results. We will show that a nearest neighbor classification approach can reach a similar performance or improve on the SVM results, given an adequate selection of features of the input...
Discriminant Adaptive Nearest Neighbor Classification and Regression
Robert Tibshirani, Department of Statistics, University of Toronto, tibs@utstat.toronto.edu. Nearest neighbor classification expects the class conditional probabilities to be locally constant, and suffers from bias in high dimensions. We propose a locally adaptive form of nearest neighbor classification to try to finesse this curse of dimensionality. We use a local linear discriminant analysis to e...
Low-Quality Dimension Reduction and High-Dimensional Approximate Nearest Neighbor
The approximate nearest neighbor problem (ε-ANN) in Euclidean settings is a fundamental question, which has been addressed by two main approaches: Data-dependent space partitioning techniques perform well when the dimension is relatively low, but are affected by the curse of dimensionality. On the other hand, locality sensitive hashing has polynomial dependence in the dimension, sublinear query...
Incremental multi-linear discriminant analysis using canonical correlations for action recognition
Canonical correlation analysis (CCA) is often used for feature extraction and dimensionality reduction. However, the image vectorization in CCA breaks the spatial structure of the original image, and the high dimensionality of the resulting vectors often brings on the curse of dimensionality. In this paper, we propose a novel feature extraction method based on CCA in a multi-linear discriminant subspace b...
The Fast Johnson-Lindenstrauss Transform and Applications
Dimension reduction is a highly useful tool in algorithm design, with applications in nearest neighbor searching, clustering, streaming, sketching, learning, approximation algorithms, vision and others. It removes redundancy from data and can be plugged into algorithms suffering from a "curse of dimensionality". In my talk, I will describe a novel technique for reducing the dimension of points ...